Glivenko–Cantelli theorem

In the theory of probability, the Glivenko–Cantelli theorem, named after Valery Ivanovich Glivenko and Francesco Paolo Cantelli, determines the asymptotic behaviour of the empirical distribution function as the number of independent and identically distributed observations grows. This uniform convergence of more general empirical measures becomes an important property of the Glivenko–Cantelli classes of functions or sets.[1] The Glivenko–Cantelli classes arise in Vapnik–Chervonenkis theory, with applications to machine learning. Applications can be found in econometrics making use of M-estimators

Contents

Glivenko–Cantelli theorem

Assume that X_1,X_2,\dots are independent and identically-distributed random variables in \mathbb{R} with common cumulative distribution function F(x). The empirical distribution function for X_1,\dots,X_n is defined by

F_n(x)=\frac{1}{n}\sum_{i=1}^n I_{(-\infty,x]}(X_i),

where I_C is the indicator function of the set C. For every (fixed) x, F_n(x) is a sequence of random variables which converge to F(x) almost surely by the strong law of large numbers, that is, F_n converges to F pointwise. Glivenko and Cantelli strengthened this result by proving uniform convergence of F_n to F.

Theorem

\|F_n - F\|_\infty = \sup_{x\in R} |F_n(x) - F(x)| {\longrightarrow} 0 almost surely.[2]

This theorem originates with Valery Glivenko,[3] and Francesco Cantelli,[4] in 1933.

Remarks

Empirical measures

One can generalize the empirical distribution function by replacing the set (-\infty,x] by an arbitrary set C from a class of sets \mathcal{C} to obtain an empirical measure indexed by sets C \in \mathcal{C}.

P_n(C)=\frac{1}{n}\sum_{i=1}^n I_C(X_i),   C\in\mathcal{C}.

Further generalization is the map induced by P_n on measurable real-valued functions f, which is given by

f\mapsto P_nf=\int_SfdP_n=\frac{1}{n}\sum_{i=1}^n f(X_i), f\in\mathcal{F}.

Then it becomes an important property of these classes that the strong law of large numbers holds uniformly on \mathcal{F} or \mathcal{C}.

Glivenko–Cantelli class

Consider a set \mathcal{S} with a sigma algebra of Borel subsets A and a probability measure P. For a class of subsets,

{\mathcal C}\subset\{C:  C \mbox{ is measurable subset of }\mathcal{S}\}

and a class of functions

\mathcal{F}\subset\{f:\mathcal{S}\to \mathbb{R}, f \mbox{ is measurable}\,\}

define random variables

\|P_n-P\|_{\mathcal C}=\sup_{c\in {\mathcal C}} |P_n(C)-P(C)|
\|P_n-P\|_{\mathcal F}=\sup_{f\in {\mathcal F}} |P_nf- P(f)|

where P_n(C) is the empirical measure, P_n f is the corresponding map, and

\mathbb{E}f=\int_\mathcal{S} fdP = P (f) , assuming that it exists.

Definitions

1. \|P_n-P\|_\mathcal{C}\to 0 almost surely as n\to\infty.
2. \|P_n-P\|_\mathcal{C}\to 0 in probability as n\to\infty.
3. \mathbb{E}\|P_n-P\|_\mathcal{C}\to 0, as n\to\infty (convergence in mean).
The Glivenko–Cantelli classes of functions are defined similarly.
\sup_{P\in \mathcal{P}(S,A)} \mathbb E \|P_n-P\|_\mathcal{C}\to 0;
\sup_{P\in \mathcal{P}(S,A)} \mathbb E \|P_n-P\|_\mathcal{F}\to 0.

Theorem (Vapnik and Chervonenkis,[6] 1968)

A class of sets \mathcal{C} is uniformly GC if and only if it is a Vapnik–Chervonenkis class.

Examples

\sup_{P\in \mathcal{P}(S,A)}\|P_n-P\|_{\mathcal C} \sim n^{-1/2}, that is \mathcal{C} is uniformly Glivenko–Cantelli class.

See also

Notes

  1. ^ van der Vaart, A.W. (1998) page 279
  2. ^ van der Vaart, A.W. (1998) page 265
  3. ^ Glivenko, V. (1933). Sulla determinazione empirica della legge di probabilita. Giorn. Ist. Ital. Attuari 4, 92-99.
  4. ^ Cantelli, F. P. (1933). Sulla determinazione empirica delle leggi di probabilita. Giorn. Ist. Ital. Attuari 4, 221-424.
  5. ^ van der Vaart, A.W. (1998) page 268
  6. ^ Vapnik, V.N. and Chervonenkis, A. Ya (1971). On uniform convergence of the frequencies of events to their probabilities. Theor. Prob. Appl. 16, 264-280

References

Further reading

  • Dudley, R. M. (1999). Uniform Central Limit Theorems, Cambridge University Press. ISBN 0 521 46102 2.
  • Shorack, G.R., Wellner J.A. (1986) Empirical Processes with Applications to Statistics, Wiley. ISBN 0-471-86725-X.
  • van der Vaart, A.W. and Wellner, J.A. (1996) Weak Convergence and Empirical Processes, Springer. ISBN 0-387-94640-3.